Adaptive Selection of Communication Methods to Optimize Collective MPI Operations
Authors
Abstract
Many parallel applications from scientific computing use collective MPI communication operations to distribute or collect data. The execution time of collective MPI communication operations can be significantly reduced by a restructuring based on orthogonal processor structures or by using specific point-to-point algorithms based on virtual communication topologies. The performance improvement depends strongly on numerous factors, such as the collective MPI communication operation, the specific group layout, the message size, the specific MPI library, and the architecture parameters of the parallel target platform. In this paper we describe an adaptive approach that determines and selects a specific processor group layout or communication algorithm for the realization of collective communication operations, with the objective of minimizing the communication overhead. When a communication method is faster than the original implementation of the collective MPI communication operation, that method is applied to perform the communication operation.
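The selection logic the abstract describes can be sketched as follows. This is a minimal illustration in Python, not the paper's implementation (which targets MPI libraries in C); all function and parameter names here are hypothetical. Each candidate communication method is benchmarked for the given payload, and the original implementation is kept unless some candidate is measurably faster:

```python
import time

def adaptive_select(candidates, original, payload, trials=3):
    """Pick the fastest communication method for this payload.

    Benchmarks the original implementation and each candidate method,
    returning whichever achieved the best (minimum) time. Falls back
    to the original when no candidate beats it. Illustrative sketch;
    names and structure are assumptions, not from the paper.
    """
    def bench(fn):
        best = float("inf")
        for _ in range(trials):
            start = time.perf_counter()
            fn(payload)
            best = min(best, time.perf_counter() - start)
        return best

    winner, winner_time = original, bench(original)
    for fn in candidates:
        t = bench(fn)
        if t < winner_time:
            winner, winner_time = fn, t
    return winner
```

In a real setting the chosen method would be cached per (operation, message size, group layout) so the benchmarking cost is paid once and amortized over repeated collective calls.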
Similar Resources
Evaluating MPI Collective Communication on the SP2, T3D, and Paragon Multicomputers
We evaluate the architectural support of collective communication operations on the IBM SP2, Cray T3D, and Intel Paragon. The MPI performance data are obtained from the STAP benchmark experiments jointly performed at the USC and HKU. The T3D clearly demonstrated the best timing performance in almost all collective operations. This is attributed to the special hardware built in the T3D for fast...
High-Level Topology-Oblivious Optimization of MPI Broadcast Algorithms on Extreme-Scale Platforms
There has been significant research in collective communication operations, in particular in MPI broadcast, on distributed memory platforms. Most of the research works optimize the collective operations for particular architectures by taking into account either their topology or platform parameters. In this work we propose a very simple and at the same time general approach to opt...
PGMPI: Automatically Verifying Self-Consistent MPI Performance Guidelines
The Message Passing Interface (MPI) is the most commonly used application programming interface for process communication on current large-scale parallel systems. Due to the scale and complexity of modern parallel architectures, it is becoming increasingly difficult to optimize MPI libraries, as many factors can influence the communication performance. To assist MPI developers and users, we pro...
Topology-oblivious optimization of MPI broadcast algorithms on extreme-scale platforms
Keywords: MPI, broadcast, BlueGene, Grid'5000, extreme-scale, communication hierarchy. Significant research has been conducted in collective communication operations, in particular in MPI broadcast, on distributed memory platforms. Most of the research efforts aim to optimize the collective operations for particular architectures by taking into a...
Exploiting Hierarchy in Parallel Computer Networks to Optimize Collective Operation Performance
The efficient implementation of collective communication operations has received much attention. Initial efforts modeled network communication and produced "optimal" trees based on those models. However, the models used by these initial efforts assumed equal point-to-point latencies between any two processes. This assumption is violated in heterogeneous systems such as clusters of SMPs and wide-are...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2005